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ABSTRACT 

We examine the dynamics of the stars and globular clusters in the nearby giant 
elliptical galaxy M87 and constrain the mass distribution, using all the available data 
over a large range of radii, including higher-order moments of the stellar line-of-sight 
velocity distributions and the discrete velocities of over two hundred globular clusters. 
We introduce an extension of spherical orbit modeling methods that makes full use of all 
the information in the data, and provides very robust constraints on the mass models. 
We conclusively rule out a constant mass-to-light ratio model, and infer that the radial 
density profile of the galaxy's dark halo falls off more slowly than r~^, suggesting that 
the potential of the Virgo Cluster is already dominant at r ~ 300" ~ 20 kpc. 



Subject headings: galaxies: elliptical and lenticular, cD — galaxies: halos — galaxies: 
individual (M87) — galaxies: kinematics and dynamics — galaxies: star clusters — 
galaxies: structure — globular clusters: general 



1. INTRODUCTION 



Elliptical galaxies lack the dynamical simplicity of spiral galaxies, posing well-known challenges 
for determining their intrinsic properties {e.g., mass distribution, shape, orbit structure). The 
projected stellar velocity dispersion radial profile, ap{R), is easily measured, but is subject to a 
degeneracy between the mass distribution and the orbital anisotropy, where radial variations in the 
stellar orbit types mask or mimic changes in the mass-to-light ratio {M/L) (Binney & Mamon 1982; 
Tonry 1983). Crucial for resolving this degeneracy is the measurement of higher-order moments of 
the stellar line-of-sight velocity distributions (LOSVDs). Such LOSVD information has been used 
to constrain the masses of central supermassive black holes (van der Marel et al. 1998; Emsellem, 
Dejonghe, & Bacon 1999; Cretton & van den Bosch 1999; Gebhardt et al. 2000) and dark halos 
(Carollo et al. 1995; Rix et al. 1997; Gerhard et al. 1998; Saglia et al. 2000; Kronawitter et al. 
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2000). Unfortunately, the surface brightness of an elHptical dechnes rapidly with radius, making 

these measurements increasingly difficult at the large radii where the dark matter problem becomes 
most interesting 2 effective radii -Reff)- Other techniques are necessary to probe this outer mass 
distribution. 

In nearby galaxies, the two main candidates for such probes are X-ray emission from extended 
hot gas (see Fabbiano 1989 §4.4 for a review), and the discrete velocities of either planetary nebulae 
{e.g., CiarduUo, Jacoby, &; Dejonghe 1993; Tremblay, Merritt, & Williams 1995; Hui et al. 1995; 
Arnaboldi et al. 1998) or globular clusters {e.g., Cohen &; Ryzhov 1997, hereafter CR; Kissler-Patig 
et al. 1999; Zepf et al. 2000). However, the analysis of discrete kinematical measurements calls 
for more refined dynamical models than cither simple virial estimators {e.g., Heisler, Tremaine, & 
Bahcall 1985; Kent 1990; Haller & Melia 1996) or binning the data and using the Jeans equations 
{e.g., Federici et al. 1993; Grillmair et al. 1994; Tremblay et al. 1995). These procedures do not 
guarantee a physical solution, make unwarranted assumptions about the orbital anisotropy, and do 
not make full use of the constraints provided by the data. The first two deficiencies are alleviated 
by techniques employing distribution function (DF) basis sets that permit the full freedom of 
orbit types {e.g., Merritt, Meylan, & Mayor 1997; Mathieu & Dejonghe 1999; Saglia et al. 2000). 
But these techniques typically bin the data in radius and velocity, destroying potentially useful 
information, such as the maximum observed velocity as a direct constraint on the escape velocity. 
More general methods are needed to make better use of the information contained in the discrete 
velocity data; Merritt &; Saha (1993) and Merritt (1993) have demonstrated one such approach 
using DF basis sets and maximum likelihood methods. We develop a general dynamical method 
based on orbit modeling, and apply it to a real galaxy, M87. 

The giant central Virgo Cluster galaxy M87 (= NGC 4486), although not a typical elliptical, 
is nevertheless a prime candidate for modeling because of the abundance of data available on it. 
In addition to high-precision measurements of its higher-order stellar velocity moments and of 
the radially-extended velocity dispersion, it has the largest available sample of globular cluster 
(GC) velocity measurements. The stellar velocity measurements extend out to ~ 1.5i?eff, and 
the GC measurements to ~ 5i?eff, probing well into the area where the dynamics are presumably 
dominated by the dark halo of the galaxy or of the Virgo Cluster. The combination of these 
kinematical constraints should provide robust limits on the dark matter distribution, and on the 
orbital structures of both the stars and the GCs. 

While a variety of modeling techniques have been used to study the stellar dynamics of M87 
(Sargent et al. 1978; Duncan & Wheeler 1980; Binncy & Mamon 1982; Newton & Binney 1984; 
Richstone &; Tremaine 1985; Dressier &: Richstone 1990; Tenjes, Einasto, & Haud 1991; van der 
Marel 1994; Merritt &; Oh 1997), they concentrated on its central regions in order to constrain 
the mass of a central black hole. The stellar dynamics of its halo have not been modeled in 
detail, nor have the higher-order velocity moments been used as strict constraints. But the best 
available tracers of the halo mass distribution are ultimately the GCs and the hot gas, which can 
be easily observed at large radii. Some simple dynamical methods have been used to estimate 
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the halo mass using the GCs (Merritt & Tremblay 1993; CR). They claim that the radially-rising 

velocity dispersion profile rules out a constant M/L model, and that it suggests that the GCs 
are in fact tracing the Virgo Cluster potential rather than M87's own potential (as suggested also 
by analyses of the X-ray emission and the kinematics of the cluster galaxies; e.g., see Nulsen 
k, Bohringer 1995; McLaughlin 1999). But a more rigorous dynamical model, allowing for the 
systematic orbital uncertainties, is necessary to confirm these findings. We will apply our general 
orbit modeling method, using a large sample (~ 200) of discrete GC velocities, to see how the GCs 
— alone and jointly with the stellar dynamics — constrain the mass distribution. In §2 wc detail 
the observational constraints for M87. We describe our modeling methods in §3, §4 presents the 
results, and the conclusions are in §5. 

2. OBSERVATIONAL CONSTRAINTS 

Here we summarize the observational constraints on M87. These include the stellar surface 
brightness (§2.1); velocity measurements of the stars (§2.2); the globular cluster surface density 
(§2.3); and globular cluster discrete velocity measurements (§2.4). We assume a distance to M87 
of 15 Mpc (Pierce et al. 1994), so that 1" = 73 pc and V = 4.4 kpc. The stellar effective radius is 
Eeff ^ 100" (Burstein et al. 1987; Zeilinger, M0ller, & Stiavelh 1993). 

2.1. Stellar Surface Brightness 

For the stellar surface brightness radial profile ii{R) in the core region (ii=0'.'01-8") we use the 
HST I -hand data of Lauer et al. (1992). We use their secing-deconvolved profile, and approximate 
the photometric uncertainties by A/Lt = max[0'.'011/i?, 0.02]. For the outer regions (i?=8"-745"), we 
use the S-band data of Capaccioli, Caon, & Rampazzo (1990), with the photometric uncertainties 
estimated from their Figure 3, and with an offset of 2.78 magnitudes to match the profile to the 
Lauer et al. (1992) data. The color gradients are quite small over the entire range of the galaxy 
(|A(i? — /)|max < 0.2 mag; e.g., Boroson et al. 1983; Cohen 1986; Zeilinger et al. 1993), so 
there should be little systematic problem in combining the /-band and i?-band data. The isophote 
ellipticity varies from 0.02 at 2" to 0.1 at 75" to 0.35 at 500". The dynamics of the galaxy depend 
on the ellipticity of its gravitational potential, which is much rounder than its density distribution 
(e$ ~ ep/3; Binney & Tremaine 1987 §2.3.1), so it will be a reasonable approximation to treat 
the galaxy as spherical. We map all of the data from the major axis a to the intermediate axis 
m = a\/l — e- 

For the purposes of our mass modeling, we project and fit a parameterized luminosity density 
model to the data of the form Vi,{r) = voir / si)~°^^ {1 + r^/s2)^"^(l + ?'/'''3)^"'S and find = 
9.9 X IQ^Lb,q arcscc-^ ai = 1.22, 0.2 = 0.361, = 1.26, .si = 1". .ss = 9'.'38, and .S3 = 170", 
with a statistic for the fit of 35.1 for 142 degrees of freedom. This functional fit will only 
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be used to generate a constant M/L gravitational potential, while we will directly fit the surface 
brightness data in our dynamical models (§3). Note that in M87, /u(i?) does not taper off steeply 
in the outer regions as in normal R^^^ ellipticals — a typical cD envelope (Schombcrt 1986) 
begins at i? ^ 300". An old dispute over the brightness of this envelope (see e.g., de Vaucouleurs 
&; Nieto 1978; Carter k. Dixon 1978) has not been resolved, presumably because of sky-subtraction 
problems for such a large, low-surface-brightness structure. Our asymptotic radial slope lies between 
the reported extremes. 

2.2. Stellar Velocities 

For the core region {R < 29"), we use the G-band data from van der Marel (1994, hereafter 
vdM) for the projected stellar velocity dispersion (Tp{R) and the Gauss-Hermitc velocity moments 
hi{R) and hQ{R) (sec §3). For the outer regions (i?=28"-168"), we use crp(R) from Scmbach & 
Tonry (1996, hereafter ST). We combine the data from positive and negative radii into one radial 
profile. There appears to be a systematic velocity offset between the ST data and most other 
data (Sargent et al. 1978; Davies Sz Birkinshaw 1988; Jarvis &; Peletier 1991; Winsall & Freeman 
1993), presumably due to the large slit width; we estimate that this corresponds to an additional 
instrumental dispersion of 183 it 11 km s~^, which we remove from the ST data in order to match 
them to the vdM data. The final velocity dispersion profile is nearly constant for R < 1", outside of 
which it falls off slowly with radius (a^ ~ R~^'^). Comparing /14 and he from positive and negative 
radii, we find differences larger than are consistent with the stated uncertainties. We assume the 
errors were underestimated by 12% and 22% to bring the profiles into statistical agreement. The 
departures from Gaussianity of the LOSVD are everywhere small (typically, |/i4|, |/i6| < 0.02). 

2.3. Globular Cluster Surface Density 

Our GC surface density radial profile N{R) is taken from the number counts of Kundu et al. 
(1999) for R = 0"-96" (with their quoted uncertainties). For R = 84"-472", wc take the data from 
McLaughlin, Harris, &: Hanes (1993), adding in their 0.4 arcmin^^ background uncertainty, and 
multiplying by 1.45 to match the normalization of the Kundu et al. (1999) data. For R = 419"- 
1351", we use the data from Harris (1986) after subtracting their background count of 5.8 it 0.3 
arcmin"^ and normalizing by a factor of 2.19. We do not convert any of these profiles to an 
intermediate axis since this is already effectively accomplished by the derivation method (counting 
in circular annuli). Like the stellar surface brightness, the GC surface density decreases slowly 
with radius in the central regions (~ R~^'^), and changes over to a steeper power law in the outer 
parts (~ R~^-^); however, the radius where this break occurs is much larger for the GCs (~ 60" vs. 
~ 10"), demonstrating that these systems are dynamically distinct. 
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2.4. Globular Cluster Discrete Velocities 

For the GC line-of-sight velocities Vz, we use data from Mould, Oke, & Nemec (1987), Huchra 
& Brodie (1987), Mould et al. (1990), CR, and Cohen (2000). Most of the positions are taken from 
Strom et al. (1981). We compare the velocities of common objects to determine the systematic 
offset between the data sets, and to estimate the measurement uncertainties {Avz ~ 100 km s~^). 
We discard as foreground stars all objects with heliocentric velocities < 250 km s~^, and as 
background galaxies all those with Vz > 2650 km s^^. Wc examine the colors and magnitudes of 
objects with Vz < 500 km s~^ to distinguish GCs from stars. Objects of uncertain identification 
are discarded. We map the data to the intermediate axis m, using ellipticity estimates from 
McLaughlin, Harris, &; Hanes (1994). Our final data set has 234 velocities from R = 25"-526" (see 
Fig. 1). 
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Fig. 1. — Line-of-sight velocities for 234 globular clusters around M87, as a function of galacto centric radius 
R. The dotted line shows a systemic velocity of 1300 km s~^. 
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There is a paucity of low velocity measurements at large radii, which may be related to the 
rotation of the GCs along the galaxy's major axis in the outer parts, as demonstrated by Kissler- 
Patig & Gebhardt (1998). For the GCs with R > 370", there were only 4 measurements taken on 
the "approaching" side and 23 on the "receding" side. Thus, the large-radius velocity asymmetry 
is probably due to incomplete spatial coverage, which may make our mean velocity for the system 
systematically high (it is decreased from 1352 to 1293 km s~^ if the clusters with R > 370" are 
omitted). This is also suggested by the global velocity distribution (Fig. 2), which appears more 
symmetric if the mean velocity is set to ~ 1300 km s~^. The best-fit Gaussian curve to the data 
has a central velocity of = 1323 it 28 km s^^ (1308 it 29 km s^^ if the outer clusters are omitted). 
We adopt a systemic velocity of 1300 km s~^, which we subtract from the measurements to get the 
galactocentric velocities. This compares well with estimates of the stellar mean redshift from vdM 
(1277 km s-^), ST (1293 km s'^), and the RC3 (1282 ± 9 km s"^; de Vaucouleurs et al. 1991). 
The large-radius velocity asymmetry will not pose any further difficulties because wc will model 
only the even part of the DF, so that all the discrete velocities are implicitly considered as present 
also at their reflected position about the mean velocity. 

Fig. 2 also suggests that the GCs have a tangentially-biased orbit distribution, which typically 
produces a flat or double-peaked LOSVD. We fit a fourth-order Gauss-Hermitc velocity moment 
(see §3) to the system, finding = — 0.04±0.02 (iTp = 404ibl6 km s~^), which is weakly suggestive 
of tangential anisotropy. A double-peaked LOSVD (especially evident at larger radii) has also been 
found in the GC system of the Fornax Cluster cD galaxy NGC 1399 (Kissler-Patig et al. 1999). 
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Fig. 2. — Distribution of globular cluster line-of-sight velocities, relative to a systemic velocity of 1300 km 
s~^. The histogram shows the number of data points in velocity bins. The solid curve shows a smoothed 
superposition of the measurements. The dotted curve shows the best- fit Gaussian curve to the data (cr = 376 
km s^^). Note the double peaks of low significance at ±200 km s~^ and at ±500 km s~^. 
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3. 



METHODS 



The method of orbit modehng was pioneered by Schwarzschild (1979) to reproduce the observed 
density distribution of a galaxy from a weighted hbrary of representative particle orbits. Given a 
fixed gravitational potential $(r), the solutions are nonparametric — no functional form is assumed 
for the DF — so that one need not worry that the models cannot fully explore the possible solutions. 
The DF is guaranteed to be nonnegative and physical. This method was extended by Richstone 
& Trcmainc (1984) to include projected velocity dispersion information, and by Rix et al. (1997) 
to include higher-order velocity moments. We used similar techniques to examine the uncertainty 
in a high-redshift galaxy's mass as implied by a measurement of its central velocity dispersion 
— an important step in determining Hq from certain gravitational lens systems (Romanowsky 
& Kochanck 1999, hereafter Paper I). Other recent orbit modeling efforts have concentrated on 
applying the method to three- integral axisymmetric systems (e.g., van der Marel et al. 1998; 
Cretton & van den Bosch 1999; Gcbhardt et al. 2000; Cretton, Rix, & de Zccuw 2000). Here we 
extend our spherical method to include the discrete velocities of GCs. The outline of the method 
is as follows. 

We begin with an assumed radial profile for the mass density distribution p{r). Our first model 
is a simple representation of a dark-matter dominated system, the singular isothermal sphere (SIS): 
p{r) = (7o/27rGr^. The second model is the constant M/L distribution described in §2.1, parame- 
terized by T^;: the ratio of the mass density p(r) and the I?-band luminosity density ;^^(r), with 
units of Mq/Lq^b- We pick a random distribution of orbits that densely samples the phase space 
of energy and angular momentum (E, |L|). The initial radii rofe of the orbits are logarithmically 
spaced in ro, and the energy Ek of each orbit is selected to correspond to that of a circular orbit 
at this radius, $(rojt) -|- v^{rok)/'^- For the SIS model, this procedure produces a sampling that is 
uniform in energy. The angular momentum Lj. of the orbit is selected randomly from the range 
[0,Linax,fc]) where Lmax.fe = i'okVc{rok)- For both the stellar models and the GC models, we use 
2500 particle orbits. For the stars, tq spans the range 0'.'07-4450" , resulting in a radial coverage of 
r = 0"-7330" (SIS model) or r = 0"-8710" (constant M/L model). For the GCs, ro =10'.'0-13360", 
corresponding to r = 0"-22030" (SIS model) or r = 0"-26420" (constant M/L model). 

We next compute the orbit projection "kernels" , which correspond to the contribution of each 
orbit to each observable. The predictions of the model can then be expressed as follows: 



where the w are the orbit weights, and the kernels K have been averaged over time and over all 
spherical-polar viewing angles {6, (p). For example, the kernel for the angle-averaged surface density 
in an annulus between Ri and R2 of an orbit at an instantaneous radius r' is given by the integral 
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where the orbit's instantaneous density is 5{r — r')/47rr^, the variable ^ represents the hne of 
sight's position in the tangential plane {0,(f)), and the integration is carried out along the line of 
sight z = i-y/r^ — i?2_ The viewing-angle integral (along d^) is nontrivial for kernels that involve 
velocity measurements. Placed at the initial radius ro, the orbit is run forward in time for one 
radial period Tr, and the final kernel is found by averaging over time: 



where the integration is handled by a Bulirsch-Stoer type integrator. 

We must calculate the model's LOSVD, dL/dvp{vp, R), to fit the observed stellar velocity 
moments (<Tp, /i4, /ie) and the GC velocity measurements. The LOSVD kernels -ft^dL/dt)p(i?),fe for a 
measurement at radius R are binned in velocity with Vp = [C^max], where fmax(^) is the velocity 
of a particle at radius r = R that has plunged inward on a radial orbit from the largest possible 
radius. For the constant M/L model, this implies that fmax — ^'csc (the escape velocity). Our 
models are symmetric in velocity, i.e., dL/dvp{vp) = dL/dvp{—Vp). By construction, f^ax scales 
linearly with the mass normalization of the galaxy model (ctq or T^), so that we can use the same 
binned LOSVD kernels when changing the mass scale. 

The Gauss-Hermite velocity moments hi, which measure deviations of a LOSVD from a Gaus- 
sian, are defined by 



where w = (vp—Vp) jdp, 70 is the line strength, (7p, Vp, dp) are the coefficients for the best Gaussian 
fit to dL/dvp, and Hi[w) are the Hermite polynomials (van der Marel & Franx 1993). The moments 
(/14, he) measure how fiat (/t4 < 0, > 0) or how peaked (/i4 > 0, /ie < 0) the LOSVD is 
compared to a Gaussian, where these signatures are typically produced by tangential and radial 
orbits, respectively. The two moments differ in how much weight they put in the high-velocity 
wings of the LOSVD. In Paper I, we fitted a Gaussian curve to the LOSVD to find (7, (Tp) at each 
step in our minimization routine, and then calculated /14; this is equivalent to solving the nonlinear 
equations ho = 1 and h2 = 0. Here, we expand the Gauss-Hermite series about the fixed data values 
o-p. Rather than directly fitting o-p, we equivalently fit the second-order moment /i2 = 0.00 it A/12, 
where A/12 — A(Tp/\/2(Tp . With this formulation, calculating the model's velocity observables is 
now linear with respect to the orbit weights. We use 41 velocity bins from i;p = to Wmax to 
numerically integrate equation (4), which we have heuristically found to give very good accuracy. 
Note that because our model is completely spherically symmetric (simulating only the even part of 
the DF), it would be more appropriate to fit measurements of 2:4 (expanded around Vp = 0) rather 
than /i4 (see van der Marel et al. 1994 §5.1); however, these measurements are not available for 
M87. Since the measured rotation is small (~ 10 km s~^), the difference would be unimportant. 





(4) 
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The kernels (KijK.^L/dvp) need be computed only once for each galaxy mass model. We then 
adjust the weights of the orbits so that the model's projected observables y™ (equation 1) best 
fit the data y*^. For most of the data (/, (Tp, /i4, /le); we express the likelihood of the fit using the 
standard statistic, 



X 



E 



y^ 



(5) 



where the measurement uncertainty is CTj, while the likelihood function for the discrete velocities 
and positions of the GCs is 

dL 



(6) 



where Vi it ai are the individual velocity measurements. In maximizing Ci, we are forcing the 
LOSVD to peak at the measured velocities Vi weighted by the measurement errors cxj. This is 
schematically illustrated in Fig. 3. Given complete freedom, this method would produce a best- 
fit solution whose LOSVDs resembled 5-functions at the measurements Uj. However, the model's 
averaging in angle and time produces an intrinsic smoothing to the LOSVDs, and does not permit 
such unphysical solutions. For the current implementation, we use 15 velocity bins from Vp = to 
Vmux for the LOSVD at each radius in order to compute the integral in equation (6). Note that our 
method does not require binning the velocities in radius, nor computing velocity moments. 
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Fig. 3. — Schematic line-of-sight velocity distributions {solid curves), fitted to discrete velocity data {dotted 
curves). The double-peaked LOSVD on the left is indicative of tangential anisotropy, while the centrally- 
peaked LOSVD on the right indicates radial anisotropy. 



The final function which we will minimize is 



where 5 is a measure of entropy: 



/(w) = -x'-^lnA + AS, 



S = ^ wl Inwl. 
k 



(7) 

(8) 
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We employ this smoothing function 5 as a heuristic device for quickly reaching a rough solution; in 
our final solutions, the Lagrangian factor A is reduced to such a low value (~ 10~^) so as to make the 
entropy constraints inconsequential. With no such regular ization imposed, this is an ill-conditioned 
inversion problem, so our DF solutions will be choppy in a way that real galaxies' DFs presumably 
are not. However, our methods are the statistically correct way to handle the uncertainties, absent 
any other a priori smoothness conditions. We use a conjugate gradient method (see Press et al. 1992 
§10.6) with first and second derivative information, to minimize /. We have tested our methods 
on an isotropic Hernquist (1990) model to demonstrate that we recover correctly the input mass, 
given constraints on I{R) and iJp(-R) (see Paper I). 



4. RESULTS 

Here we present the results of our modeling. First we investigate the necessity for dark matter 
in the system by comparing two simple models (§4.1). Then we consider more generalized mass 
models (§4.2). Finally, wc report constraints on the orbit anisotropics (§4.3). 



4.1. Test for Dark Matter 

We first fit the two simple mass models (SIS and constant M/L) to the stellar data [I{R), 
d-p{R), h4{R), he{R)] over a sequence of the mass parameters ctq and (see Fig. 4). Table 1 lists 

the best-fit values for these parameters and the difference in log-likclihood between the two models. 
For the SIS model, we find ctq = 272^5^ km s^-*^; the fit of the best solution to the data is shown in 
Fig. 5. For the constant M/L model, we find = 8.1 it 0.6. This model does not fit the data as 
well as the best SIS model does (la significance), providing weak evidence for a dark matter halo 
in this galaxy. 

In order to see which constraints are important for indicating the presence of dark matter, 
we have tried models which lack either the hQ{R) data or the large-radius d-p{R) data, (Table 1). 
Dropping these constraints makes little difference in differentiating between mass models (though 
the mass normalization of the SIS model does change in the latter case). In particular, we note that 
the extra velocity dispersion data is unhelpful, owing to the flexibility of the orbital anisotropics 
to reproduce an arbitrary velocity dispersion profile. Only with additional LOSVD information at 
large radii can the stellar kinematical data by itself further constrain the mass distribution. 
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Fig. 4. — The log-likelihood of the best-fit singular isothermal model {left panel) and constant mass-to-light 
ratio model {right panel), as a function of the mass parameters cto and T^. The filled squares show the fit 
to the stellar data, and the open squares the fit to the globular cluster data. 



Singular Isothermal Constant M/L 



Constraints (Tq (km s ^) Tb (Mq/Lq^b) Aln>C 



Stars: I, avdM, ^ST, ^4, K 


272tl^ 


8.1 ±0.6 


0.8 (1.0 cr) 


Stars: /, a^du, ^st, 


2751^9 


o -1+0.7 


0.6 (0.9 a) 


Stars: /, cJvdMj h^, Iiq 


254l?2 


8.311;? 


0.9 (1.0 a) 


Globular clusters 


mtll 


42.9lf;? 


12.9 (4.7 cr) 


Stars -|- Globular clusters 


3011^4 




206.0 (20.1 a) 



Table 1. — M87 model solutions, given different sets of observational constraints. The best-fit mass 
parameters {(Tq and T^) arc given for the SIS and constant M/L models. The difference in log- likelihood 
between the two models is shown, along with the statistical significance at which the SIS model is preferred. 
The "Baycsian" probability for the SIS model is (1 + e~'^''^'^)~^ ~ 1, while that for the constant M/L model 

is e-^'"^(l + e-'^l"^)-! ~ g-Aln£ foj. ^i^jr ^ 3. 
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Fig. 5. — Fits to the M87 stellar data {error bars) for the best singular isothermal model (ctq = 272 km s~^ : 

solid lines), the best constant mass-to- light ratio model (T^ = 8.1 : dotted lines), and the best NFW2 model 
(sec §4.2; dashed lines). The data arc the radial profiles of the surface brightness, the velocity dispersion, and 
the fourth- and sixth-order Gauss-Hermite velocity moments {from top to bottom). Note that ReS — 100". 
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We fit the GC data to the same two mass models (see Fig. 4 and Table 1). For the SIS 

model, we find ao = 409I24 ^™ ^"^5 °f best SIS solution to the data is shown in Figs. 

6-8. Note especially in Fig. 7 that some of the higher-order features of the observed LOSVDs are 
reproduced by the model. For the constant M/L model, we find = 43 zt 3. This model fits the 
data much worse than does the SIS model (5 a significance). Here, the constant M/L model cannot 
reproduce the radially-rising GC velocity dispersion (see Fig. 6). Also, Ts = 43 is implausible for a 
standard stellar population, requiring an age S> 17 Gyr and a metallicity [Fe/H] » 0.50 (Worthey 
1994), so that even if mass traced light, the mass could not consist of a standard stellar population. 
Finally, note that = 43 it 3 is completely inconsistent with 8.1 ib 0.6 as obtained via the stellar 
dynamics, unless there is a drastic radial gradient in the stellar population. We thus conclude 
that the constant M/L model is definitively ruled out for M87, assuming standard gravitational 
dynamics. (Note that the 20 a significance given in Table 1 is formal only, and that in practice, a 
number of systematic uncertainties will reduce this significance — but it can still be said that the 
result holds to a high significance level.) 
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FlG. 6. — Fits to the M87 globular cluster data {error bars) for the best singular isothermal model ( ctq = 409 
km s~^: solid lines), the best constant mass-to-light ratio model (T^ = 42.9: dotted lines), and the best 
NFW2 model (see §4.2; dashed lines). The data include the radial profiles of the surface density, the velocity 
dispersion, and the fourth-order Gauss-Hermite velocity moment {from top to bottom). Note that only the 
surface density is actually fit, while the other data are shown for comparative purposes only. 
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Fig. 7. — Line-of-sight velocity distributions for the M87 globular cluster system, in nine radial bins. The 
heavy solid lines show the LOSVDs of the best-fit singular isothermal solution (cto = 409 km s~^), averaged 
in each bin. The light solid lines show the data (26 points in each bin). The dotted lines show simulated 
LOSVDs derived from the superposition of the data in each bin, symmetrized about Vp = 0. Compare Fig. 
3. 
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Fig. 8. — Contour plot of line-of-siglit velocity distribution {I~^dl / dvp) for the best-fit singular isothermal 
globular cluster system model (ctq = 409 km s~^), normalized by the surface brightness at each radius R. 
The last contour represents zero intensity. The data are shown as squares. Note that the model LOSVD is 
peaked at small radii {R < 50") and is flat and even double-peaked elsewhere. 



4.2. Generalized Mass Models 



With a constant M/L model ruled out, the next question is what the actual radial distribution 
of mass is in the system. The simple SIS model is seen to be inaccurate: the ctq mass parameters 
derived separately from the stellar and GC constraints are inconsistent at the 99% confidence level 
(sec Fig. 4 and Table 1). That the GC-derived mass is higher than the stellar-derived mass implies 
that the radial mass profile M(r) rises more steeply than isothermal (oc r) in the outer parts. To 
make an initial estimation of M(r), we use the spherical isotropic Jeans equations (see Sargent et 
al. 1978 equation 11 and compare CR Fig. 6), modeling the stellar and GC subsystems separately. 
The stellar constraints imply M ~ r°-^ out to r ~ 150" ~ 10 kpc, while the GC constraints imply 
M ~ r^-^ outside this radius (see Fig. 9). These crude Jeans models thus suggest that it would be 
fruitful to further explore two-component mass models consisting of a constant M/L galaxy plus a 
dark halo with M(r) steeper than isothermal in the region of r 100"-500" (7-36 kpc). 
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Fig. 9. — The radial mass profile of M87. The heavy solid lines show estimates from the Jeans equations 
using the stars and globular clusters separately. The long-dashed lines show confidence limits from the X-ray 
analysis of Nulsen & Bohringer (1995). The other lines show the best orbit models fit to the combined star 
and GC data, for the constant M/L model (dotted), the SIS model (dashed), and the NFW2 model (light 
solid). The point with error bars shows the estimated mass of the central supermassive black hole (Macchetto 
et al. 1997). 
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We next use our orbit modeling methods to examine such a generahzed mass model, where the 

form of the dark halo is based on the the predictions from simulations of cosmologically-motivated 
structure formation {e.g., Navarro, Prenk, &; White 1996). This model has a total radial mass 
density profile of 

p{r) = TBMr)+ 



r(r + rs)2' 

where is the break radius between the halo's inner profile and outer profile. We construct 
mass models with a variety of the parameters {T b, Po,rs), fitting the combined stellar and GC 
constraints, and report the results of our three best models in Table 2 (see also Figs. 5 and 6). 
Note that each distinct model begins with a set of kernels calculated for two fixed parameters 
{Tb/ Po^Ts), and may then have its overall mass normalization scaled linearly to best fit the data. 



Model name 




Model Parameters 




AlnC 


Tb {Mq/Lq^b) 


(70 (km s-i) por^s (lO^^Mo) 


rs ("; kpc) 


Constant M/L 


9.3 






208.3 (20.3 a) 


SIS 




301 




3.7 (2.3 a) 


NFWl 


5.6 


0.9 


500 ; 36 


0.4 (0.8 a) 


NFW2 


5.9 


2.4 


939.3 ; 68 




NFW3 


6.5 


4.8 


1800 ; 131 


0.6 (0.9 a) 



Table 2. — M87 model solutions, given combined stellar and globular cluster constraints. The best-fit 
model parameters are given for each model (see text for description). The difference in log- likelihood between 
each model and the best overall model (NFW2) is shown, along with the statistical significance at which the 
model is dispreferred. 

AH these models (NFWl, NFW2, and NFW3) can produce better fits to the fuh data set than 
can the best SIS model (at the 98% significance level), and thus probably better approximate the 
overall mass distribution. The mass profiles of these models are consistent with M(r) derived from 
the galaxy's X-ray halo (Nulsen Sz Bohringer 1995), both in mass normalization and in density 
exponent (see Fig. 9). Both our NFW models and the X-ray model have density profiles that 
decline more gradually than isothermal in the galaxy's outer parts {e.g., p ~ r^^'^ at r ~ 200" ~ 
15 kpc), giving a circular velocity curve Vc{r) that continues rising at large radii (Fig. 10). Most 
elliptical galaxies have a constant or declining Vc{r) at these radii (Gerhard et al. 2001; the notable 
exception is the Fornax cD galaxy NGC 1399), so this is strong evidence that the dark matter at 
large radii is associated with the Virgo Gluster itself. 
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Fig. 10. — Circular velocity profile of M87, for different models. Lines styles are as in Fig. 9, with the 
addition of the NFWl [dot-short dashed) and NFW3 [dot-long dashed) models. The vertical dashed line 
indicates the outer radius of the region constrained by the GC data. 

A caveat about these new models is that the mass normalization from the stellar and GC 
data separately is again inconsistent (at the 1-2 a level). This may indicate that a still more 
accurate mass model M(r) could be found to fit all the existing constraints (possibly one with 
a larger, less-concentrated cluster halo as in McLaughlin 1995), and/or that one or both of the 
dynamical subsystems (stars and GCs) would be better modeled without the assumption of spherical 
symmetry. Further exploration of more detailed models would be useful, but outside the scope of 
this project. 



4.3. Orbit Anisotropies 

We have not implemented any method of rigorously exploring the range of DF functional forms 
permitted by the data, so the following results are meant to be suggestive rather than conclusive. 
Fig. 11 illustrates the stellar orbital characteristics of the overall best-fit solutions, where the 
mean anisotropy is characterized by the parameter /3 = 1 — {vg)/{v'^). The NFW solutions have a 
somewhat radial DF in the region constrained by the kinematical data (r ~ 0'.'15-170" ~ 0.01-12 
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kpc). The downwards dive in /3 near r = O'.'T is due to the sharp turnover in <Tp(i?) inside that 
radius. Real galaxies presumably have smooth /9(r) profiles, but with no smoothness imposed, our 
models are somewhat noisy because of the ill-conditioning of this inversion problem. Also, the 
region r ^ 1" is within the radius of influence of the central supermassive black hole (with mass 
Mbh = 3.2 lb 0.9 X 10^ Mq; Macchetto et al. 1997), and thus has not been accurately treated by our 
models. In summary, our best models indicate mild radial orbit anisotropy (/? = 0.2-0.5) for the 
well-constrained stellar region of the galaxy (r ^ l"-30" ~ 0.1-2 kpc) — a characteristic which has 
turned out to be quite typical for bright elliptical galaxies (see Gerhard et al. 2001 and references 
therein) . 
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Fig. 11. — Stellar orbital anisotropy as a function of radius, for the overall best-fit solutions from five differ- 
ent mass models: constant M/L {dotted line), SIS {solid line), NFWI {dot-short dashed line), NFW2 {dashed 
line), and NFW3 {dot-long dashed line). The vertical dashed lines indicate the radial regions constrained by 
the stellar kinematical data. 



Fig. 12 shows the orbital characteristics of the globular cluster system for the overall best- 
fit solutions. There is considerable variation between the best NFW solutions, so we cannot yet 
strongly constrain the GC anisotropy, except to note that it seems roughly isotropic at large radii 
(r ~ 100"-500" ~ 10-40 kpc). This seems to contradict the tangential anisotropy one would expect 
given the fiat and double-peaked LOSVDs seen in the data and in the models that fit only the GC 
data (see §2.4 and Figs. 7 and 8); — illustrating further the need for a mass model that better fits 
both the stellar and GC constraints. Note that the solutions' radial anisotropy at r ^ 40" ^ 3 kpc 
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should not be considered highly significant since there are only 8 GC velocities measured inside 
that projected radius. 




Fig. 12. — Velocity anisotropy of globular cluster system as a function of radius, for different mass models 
(see Fig. 11 for line descriptions). The vertical dashed lines indicate the radial region constrained by the 
GC velocity data. 



5. CONCLUSIONS 

We have developed a very general spherical orbit modeling method that makes full use of all 
the information in a set of discrete velocity data, and have applied it to the giant elliptical galaxy 
M87 using all the available dynamical constraints (measurements of the spatial distribution and 
the kinematics of the stars and of the globular clusters). By comparing two simple mass models — 
a constant mass-to-light ratio galaxy and a singular isothermal sphere — we find that a constant 
M/L is ruled out to a high level of significance. 

We further implement some more generalized mass models in order to more accurately represent 
the galaxy as consisting of a constant M/L stellar system plus a dark halo. We find improved 
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solutions which have a dark halo density profile decreasing more slowly than ~ at large radii 
(10-40 kpc). This very extended halo seems best interpreted as belonging to the Virgo Cluster 
itself, rather than to M87. 

There is substantial room for improvement of our M87 models. First, measurements of higher- 
order velocity moments for the stars arc available to only 0.3i?cff- Such measurements at larger 
radii would be quite helpful in better constraining the models, and can be most efficiently obtained 
by measuring discrete velocities of the galaxy's planetary nebulae. Second, a larger range of mass 
models could be explored to find solutions that fit the stellar and GC data together more consis- 
tently. And third, future models could relax our assumption of spherical symmetry, providing a 
more accurate model of the system in its outer parts, where flattening is seen in both the stellar 
and GC systems. 

The same techniques should also be applied to other, non-cD giant elliptical galaxies in clusters 
(such as NGC 4472), as well as to galaxies in the field. The different formational histories of these 
galaxies (via mergers, accretion, etc.) may result in significantly different mass distributions and 
dynamics. Furthermore, the colors, abundances, and spatial distributions of the GCs in M87 
and other galaxies indicate that they comprise two or more distinct populations {e.g., Neilsen & 
Tsvetanov 1999; Gebhardt Sz Kissler-Patig 1999). It would be interesting to use our orbit modeling 
methods to look for any dynamical differences between these populations, which may shed light on 
their formation histories, for which several different scenarios have been proposed {e.g., Ashman &: 
Zepf 1992; Forbes, Brodie, & Grillmair 1997; Harris, Harris, k McLaughlin 1998; Cote, Marzke, 
&; West 1998). The available set of kinematical tracers observed in the outer parts of early- type 
galaxies is now increasing rapidly, and we are hopeful that a much clearer picture of galaxy halos 
is forthcoming. 
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